Unsupervised learning of derivational morphology

Author

  • Eric Gaussier
Abstract

We present in this paper an unsupervised method to learn suffixes and suffixation operations from an inflectional lexicon of a language. The elements acquired with our method are used to build stemming procedures and can assist lexicographers in the development of new lexical resources.

Similar resources

Towards a Malay Derivational Lexicon: Learning Affixes Using Expectation Maximization

We propose an unsupervised training method to guide the learning of Malay derivational morphology from a set of morphological segmentations produced by a naïve morphological analyzer. Using a morphology-based language model, we first estimate the probability of a given segmentation. We train the model with EM to find the segmentation that maximizes the probability of each morpheme. We extract t...

DerivBase.hr: A High-Coverage Derivational Morphology Resource for Croatian

Knowledge about derivational morphology has been proven useful for a number of natural language processing (NLP) tasks. We describe the construction and evaluation of DERIVBASE.HR, a large-coverage morphological resource for Croatian. DERIVBASE.HR groups 100k lemmas from web corpus hrWaC into 56k clusters of derivationally related lemmas, so-called derivational families. We focus on suffixal de...

A Framework for Learning Morphology using Suffix Association Matrix

Unsupervised learning of morphology is used for automatic affix identification, morphological segmentation of words and generating paradigms which give a list of all affixes that can be combined with a list of stems. Various unsupervised approaches are used to segment words into stem and suffix. Most unsupervised methods used to learn morphology assume that suffixes occur frequently in a corpus...

A Language-independent Approach to Extracting Derivational Relations from an Inflectional Lexicon

In this paper, we describe and evaluate an unsupervised method for acquiring pairs of lexical entries belonging to the same morphological family, i.e., derivationally related words, starting from a purely inflectional lexicon. Our approach relies on transformation rules that relate lexical entries with one another, and which are automatically extracted from the inflected lexicon based on su...

From Signatures to Finite State Automata

In this paper, we outline the design of a nondeterministic finite state automaton (NFSA) for natural language morphology, and compare it to previous work in unsupervised learning of morphology. In Section 2, we describe the nature of an MDL-based system for unsupervised learning of morphology, using the signature-based model of Goldsmith 2001 as an example, and we describe some drawbacks of the...

Publication date: 1999